performance score
Emergent Collective Memory in Decentralized Multi-Agent AI Systems
We demonstrate how collective memory emerges in decentralized multi-agent systems through the interplay between individual agent memory and environmental trace communication. Our agents maintain internal memory states while depositing persistent environmental traces, creating a spatially distributed collective memory without centralized control. Comprehensive validation across five environmental conditions (20x20 to 50x50 grids, 5-20 agents, 50 runs per configuration) reveals a critical asymmetry: individual memory alone provides 68.7% performance improvement over no-memory baselines (1563.87 vs 927.23, p < 0.001), while environmental traces without memory fail completely. This demonstrates that memory functions independently but traces require cognitive infrastructure for interpretation. Systematic density-sweep experiments (rho in [0.049, 0.300], up to 625 agents) validate our theoretical phase transition prediction. On realistic large grids (30x30, 50x50), stigmergic coordination dominates above rho ~ 0.20, with traces outperforming memory by 36-41% on composite metrics despite lower food efficiency. The experimental crossover confirms the predicted critical density rho_c = 0.230 within 13% error.
- Europe > Germany > Baden-Württemberg > Freiburg (0.40)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Wisconsin > Dane County > Madison (0.14)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (4 more...)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
Hybrid Deep Learning Model to Estimate Cognitive Effort from fNIRS Signals
Sharmin, Shayla, Barmaki, Roghayeh Leila
This study estimates cognitive effort based on functional near-infrared spectroscopy data and performance scores using a hybrid DeepNet model. The estimation of cognitive effort enables educators to modify material to enhance learning effectiveness and student engagement. In this study, we collected oxygenated hemoglobin using functional near-infrared spectroscopy during an educational quiz game. Participants (n=16) responded to 16 questions in a Unity-based educational game, each within a 30-second response time limit. We used DeepNet models to predict the performance score from the oxygenated hemoglobin, and compared traditional machine learning and DeepNet models to determine which approach provides better accuracy in predicting performance scores. The result shows that the proposed CNN-GRU gives better performance with 73% than other models. After the prediction, we used the predicted score and the oxygenated hemoglobin to observe cognitive effort by calculating relative neural efficiency and involvement in our test cases. Our result shows that even with moderate accuracy, the predicted cognitive effort closely follow the actual trends. This findings can be helpful in designing and improving learning environments and provide valuable insights into learning materials.
- North America > United States > Delaware > New Castle County > Newark (0.14)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Oceania > Australia > Australian Capital Territory > Canberra (0.05)
- (3 more...)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Health Care Technology (1.00)
- Education > Educational Technology > Educational Software > Computer Based Training (0.34)
Tuning the Tuner: Introducing Hyperparameter Optimization for Auto-Tuning
Willemsen, Floris-Jan, van Nieuwpoort, Rob V., van Werkhoven, Ben
Abstract--Automatic performance tuning (auto-tuning) is widely used to optimize performance-critical applications across many scientific domains by finding the best program variant among many choices. Efficient optimization algorithms are crucial for navigating the vast and complex search spaces in auto-tuning. As is well known in the context of machine learning and similar fields, hyperparameters critically shape optimization algorithm efficiency. Y et for auto-tuning frameworks, these hyperparameters are almost never tuned, and their potential performance impact has not been studied. We present a novel method for general hyperparameter tuning of optimization algorithms for auto-tuning, thus "tuning the tuner". In particular, we propose a robust statistical method for evaluating hyperparameter performance across search spaces, publish a F AIR data set and software for reproducibility, and present a simulation mode that replays previously recorded tuning data, lowering the costs of hyperparameter tuning by two orders of magnitude. We show that even limited hyperparam-eter tuning can improve auto-tuner performance by 94.8% on average, and establish that the hyperparameters themselves can be optimized efficiently with meta-strategies (with an average improvement of 204.7%), demonstrating the often overlooked hyperparameter tuning as a powerful technique for advancing auto-tuning research and practice. UTOMA TIC performance tuning, or auto-tuning, is a widely established method for optimizing the performance of applications in many scientific domains, including radio astronomy [1]-[4], image processing [5]-[7], fluid dynamics [8]-[10], and climate modeling [11]-[13]. Auto-tuning automates the process of exploring the myriad of implementation choices that arise in performance optimization, such as the number of threads, tile sizes used in loop blocking, and other code optimization parameters [14]. At the heart of the auto-tuning method is a search space of functionally-equivalent code variants that is explored by an optimization algorithm.
- Europe > Netherlands > South Holland > Leiden (0.05)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Europe > Finland (0.04)
- Overview (0.93)
- Research Report > Promising Solution (0.34)
- North America > United States > Wisconsin > Dane County > Madison (0.14)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (4 more...)
Beyond the high score: Prosocial ability profiles of multi-agent populations
Tesic, Marko, Zhao, Yue, Leibo, Joel Z., Trivedi, Rakshit S., Hernandez-Orallo, Jose
The development and evaluation of social capabilities in AI agents require complex environments where competitive and cooperative behaviours naturally emerge. While game-theoretic properties can explain why certain teams or agent populations outperform others, more abstract behaviours, such as convention following, are harder to control in training and evaluation settings. The Melting Pot contest is a social AI evaluation suite designed to assess the cooperation capabilities of AI systems. In this paper, we apply a Bayesian approach known as Measurement Layouts to infer the capability profiles of multi-agent systems in the Melting Pot contest. We show that these capability profiles not only predict future performance within the Melting Pot suite but also reveal the underlying prosocial abilities of agents. Our analysis indicates that while higher prosocial capabilities sometimes correlate with better performance, this is not a universal trend-some lower-scoring agents exhibit stronger cooperation abilities. Furthermore, we find that top-performing contest submissions are more likely to achieve high scores in scenarios where prosocial capabilities are not required. These findings, together with reports that the contest winner used a hard-coded solution tailored to specific environments, suggest that at least one top-performing team may have optimised for conditions where cooperation was not necessary, potentially exploiting limitations in the evaluation framework. We provide recommendations for improving the annotation of cooperation demands and propose future research directions to account for biases introduced by different testing environments. Our results demonstrate that Measurement Layouts offer both strong predictive accuracy and actionable insights, contributing to a more transparent and generalisable approach to evaluating AI systems in complex social settings.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Switzerland (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
Motif 2.6B Technical Report
Lim, Junghwan, Lee, Sungmin, Kim, Dongseok, Park, Eunhwan, Park, Hyunbyung, Lee, Junhyeok, Cheung, Wai Ting, Choi, Dahye, Her, Jaeheui, Huh, Jaeyeon, Jung, Hanbin, Kang, Changjin, Kim, Beomgyu, Kim, Jihwan, Kim, Minjae, Kim, Taehwan, Kim, Youngrok, Lee, Haesol, Lee, Jeesoo, Lee, Kungyu, Oh, Dongpin, Park, Yeongjae, Ryu, Bokki, Suh, Daewon, Weon, Dongjoo
Recent advancements in Large Language Models (LLMs) have revolutionized artificial intelligence, yet developing an effective foundational LLM that balances high performance with computational efficiency remains challenging, especially for emerging research groups. To address this gap, we introduce Motif-2.6B, a 2.6-billion-parameter foundation model designed to democratize advanced LLM capabilities. Motif-2.6B incorporates several innovative architectural enhancements, including Differential Attention and PolyNorm activation functions, which improve long-context comprehension, reduce hallucination, and enhance in-context learning capabilities. We rigorously tested multiple novel architectural components through extensive experimentation to determine the optimal architecture for Motif-2.6B. Comprehensive evaluations demonstrate that Motif-2.6B consistently meets or exceeds the performance of similarly sized state-of-the-art models across diverse benchmarks, showcasing its effectiveness, scalability, and real-world applicability. Through detailed experiments and tailored techniques, Motif-2.6B significantly advances the landscape of efficient, scalable, and powerful foundational LLMs, offering valuable insights and a robust foundation for future research and deployment.
- Research Report > New Finding (0.46)
- Research Report > Promising Solution (0.34)
Finetuning Deep Reinforcement Learning Policies with Evolutionary Strategies for Control of Underactuated Robots
Calì, Marco, Sinigaglia, Alberto, Turcato, Niccolò, Carli, Ruggero, Susto, Gian Antonio
Deep Reinforcement Learning (RL) has emerged as a powerful method for addressing complex control problems, particularly those involving underactuated robotic systems. However, in some cases, policies may require refinement to achieve optimal performance and robustness aligned with specific task objectives. In this paper, we propose an approach for fine-tuning Deep RL policies using Evolutionary Strategies (ES) to enhance control performance for underactuated robots. Our method involves initially training an RL agent with Soft-Actor Critic (SAC) using a surrogate reward function designed to approximate complex specific scoring metrics. We subsequently refine this learned policy through a zero-order optimization step employing the Separable Natural Evolution Strategy (SNES), directly targeting the original score. Experimental evaluations conducted in the context of the 2nd AI Olympics with RealAIGym at IROS 2024 demonstrate that our evolutionary fine-tuning significantly improves agent performance while maintaining high robustness. The resulting controllers outperform established baselines, achieving competitive scores for the competition tasks.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Italy (0.04)